ABSTRACT


Time shared systems permit the fixed costs of computing resources to be 
spread over large numbers of users. However, "bottleneck" results in the 
theory of closed queuing networks can be used to show that this economy of 
scale will be offset by the increased congestion that results as more users
are added to the system. If one considers the total costs, including the 
congestion cost, there is an optimal number of users for a system which 
equals the "saturation" value usually used to define system capacity. 


Designers of time shared systems face a trade-off between the economies 
of scale associated with sharing computing resources and the congestion 
resulting from users' contention for them. Congestion increases the users' 
total costs because their time could be used for purposes other than console
interaction. This paper explores this trade-off using results from the 
theory of closed queuing networks.^1 This theory has been developed and
validated many times in the last decade and more. When combined with a 
simple model of costs, the theory shows that in the short run — i.e., when 
the system's configuration is fixed — a curve of average total cost per user 
plotted against the number of users supported is U-shaped. The minimum 
point of this curve determines the number of users N* that can be served 
at lowest cost by this configuration. Moreover, N* is equal to the
"capacity" given by "bottleneck" models, e.g., [4], [10], [14], [15]. That
is, it is always economic fully to utilize the bottleneck server. 

System Cost Model 

This paper employs a general definition of "cost" used in economics. 
That is, the cost of a good or service is measured by the value given up when
it is used for one purpose rather than another. As implied by this defini- 
tion, cost is a way of comparing alternatives and not an intrinsic property 
of a good or service. While money is sometimes a convenient metric for 
costs, goods or services which have no observable money price may still be 
costly. Here we consider two cost elements which are measured in money 
terms directly and a third element, user time costs, that can be expressed
indirectly in money terms.

The first type of money costs are so-called "fixed" costs, which cannot
be varied from one day to the next. Fixed costs depend on the system con- 
figuration, including, for instance, equipment rental, power, heat, space 
and labor. However, in the long run these costs can be varied by changing 
configuration or location or by hiring and firing employees. 

"Variable" costs, which depend on the output of the system, are the 
second type of money cost. This paper uses a model with a single class of 
users. Therefore, output can be measured in units of a homogeneous commodity
called "computing services."^2 One unit of computing services comprises a
bundle of goods and services which are consumed by one user. The next
section shows how to calculate an index of computing services from the
parameters of a queuing network model. In general terms, a unit of computing
service includes computing power (measured by CPU cycles, memory allocation,
and so forth), communication facilities, and user support in appropriate
proportions.

Because of congestion, however, the number of units of service
delivered to the system's users during some time period (i.e., the
throughput) varies. More service is delivered per hour per user when the
system is lightly loaded than when it is heavily loaded. Rather than measure
output (i.e., computing services) so that it varies with congestion, we will
take the variable costs to be proportional to the number of users, N, and
work with N parametrically. Thus, if variable costs per user per hour, such
as terminal and line costs, are c, the variable costs per hour will be cN
while throughput is (say) X(N) units of computing service per hour.

The third major source of costs is not always expressed in money
terms. This is the cost to users of the time spent interacting with a time
sharing system. Time is costly because the time spent using a time sharing 
system almost always could be put to some other productive use. Each user's 
time cost can be thought of as the product of (a) the time required to per- 
form some task on the system and (b) the money cost of a unit of the user's 
time. In our model, we let w represent the money price of a user's time. 

Time costs often show up as productivity losses. For example, consider 
an on-line order processing application. (Processing one application 
requires a fixed number of units of computing service.) Hypothetically, 
suppose it takes one second of machine time to process an application. The 
total time required to process the application includes this time, plus the 
"think time" of the human operator. If the think time is, say, 20 seconds 
the total time per order processed will be 21 seconds, and the rate at which 
orders are processed is 3600/21 ≈ 171 per hour. The cost per order
processed is clearly w/171 ≈ w(21)/3600, where the units of w are
dollars per hour. Now, suppose congestion increases the system's response
time, so that it takes, say, 30 seconds to deliver the same amount of
processing capability (i.e., 1 second). The order processing rate drops to
3600/30 = 120 per hour, which we see as a loss of productivity because the
cost per order is now w/120.^3
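The arithmetic of this example can be checked with a short sketch. It assumes that the 30-second figure is the total time per order (20 seconds of think time plus 10 seconds of response time), which is the reading implied by the rate 3600/30. The function name and the value w = 17 $/hr (borrowed from footnote 6) are illustrative only.

```python
def cost_per_order(w_dollars_per_hour, response_s, think_s):
    """Return (orders per hour, time cost per order in dollars)."""
    cycle_s = response_s + think_s            # total seconds per order
    orders_per_hour = 3600.0 / cycle_s
    return orders_per_hour, w_dollars_per_hour * cycle_s / 3600.0

w = 17.0  # assumed value of the operator's time in $/hr (illustrative)

# Lightly loaded system: 1 s of machine time plus 20 s of think time.
light_rate, light_cost = cost_per_order(w, response_s=1.0, think_s=20.0)

# Congested system: assumed 10 s response plus 20 s think time = 30 s total.
heavy_rate, heavy_cost = cost_per_order(w, response_s=10.0, think_s=20.0)

print(f"light load: {light_rate:.0f} orders/hr, ${light_cost:.4f} per order")
print(f"congested:  {heavy_rate:.0f} orders/hr, ${heavy_cost:.4f} per order")
```

The cost per order rises in direct proportion to the time per order, which is the point the example makes.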


Delay in a Time Sharing System


The above example illustrates the general point that each user's time
costs are proportional to the time required to deliver one unit of computing
services.^4 We now consider how long this takes, using the so-called
"bottleneck" results of Muntz and Wong [14] and Chang and Lavenberg [4] for
closed queuing network models. These results provide a simple, very general
relation between the speed of the system, the number of users, and the time
required to provide a given amount of service using the system. We
summarize the model and the results we need as follows.

A time sharing system with N users can be thought of as a network of
multi-server queues. Users are queued or in service at one node of the
network, and are dispatched to the next node when they finish service
according to fixed probabilities. One of these nodes (with one server for
every user) represents the users' terminals; others might be the central
processing facility (which might have several CPUs in it), or I/O devices
of some kind.

Formally, let $v_i$ be the number of visits at node $i$ for every
visit to the terminal node, $1/\mu_i$ be the service requirement at node $i$
per visit, and $m_i$ be the number of servers at node $i$. To include
the infinite-server node representing terminals, let its index be 1, with
$v_1 = 1$, and let $1/\mu_1$ be the user "think time." At each node,
$v_i/\mu_i$ is the total service requirement. With $n$ nodes in all, the sum
$T(1) = v_2/\mu_2 + \cdots + v_n/\mu_n$ represents the mean total time a
user who never has to queue spends in the system, and implicitly defines a
unit of computer services. The mean time required to deliver a unit of
computer services when there is only one user (i.e., when $N = 1$) is
therefore $S(1) = v_1/\mu_1 + \cdots + v_n/\mu_n = 1/\mu_1 + T(1)$.
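As a minimal sketch of these definitions, the following computes T(1) and S(1) from the parameters listed in Table 1. The `Node` class and its field names are hypothetical, introduced only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    visits: float        # v_i: visits per visit to the terminal node
    service_time: float  # 1/mu_i: seconds of service per visit

    @property
    def demand(self) -> float:
        """Total service requirement v_i / mu_i, in seconds per cycle."""
        return self.visits * self.service_time


# Parameters taken from Table 1 of the example.
terminals = Node("terminals", 1, 20.0)  # think time 1/mu_1 = 20 s
devices = [
    Node("cpu", 20, 0.05),
    Node("disk", 11, 0.08),
    Node("drum", 8, 0.04),
]

T1 = sum(d.demand for d in devices)  # T(1): time in system with no queueing
S1 = terminals.demand + T1           # S(1): think time plus T(1)

print(f"T(1) = {T1:.2f} s, S(1) = {S1:.2f} s")
```

With the Table 1 values this gives T(1) = 2.20 seconds and S(1) = 22.20 seconds, the figures quoted in the example section.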

Now, for any N we can calculate the throughput X(N) and the mean
response time T(N). The mean time to deliver one unit of computer
services when there are N users in the system is $S(N) = T(N) + 1/\mu_1$
(i.e., the response time plus the think time). The cycle time S(N)
measures the average real time required to deliver one unit of computing
services. The cost per unit delivered is thus (w/3600) S(N), with S(N) in
seconds and w in dollars per hour. The number of units of service
delivered in an hour is 3600 X(N). By Little's Result, N = X(N)S(N) for
this system. Hence, user time costs are wN per hour when there are N users
and the throughput is X(N).
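Written out as one chain, with S(N) in seconds, X(N) in units per second, and w in dollars per hour, the step from cost per unit to cost per hour is:

```latex
\begin{align*}
\text{user time cost per hour}
  &= \underbrace{\frac{w}{3600}\, S(N)}_{\text{cost per unit}}
     \times
     \underbrace{3600\, X(N)}_{\text{units per hour}} \\
  &= w\, X(N)\, S(N) \\
  &= w N \qquad \text{(Little's Result, } N = X(N)\,S(N)\text{)} .
\end{align*}
```

The result does not depend on how congested the system is: each of the N users is occupied for the whole hour, whether thinking or waiting.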

The Bottleneck Model [4], [11], [14], [15]

When N is small the cycle time is approximately S(1) for each user,
and throughput is approximately X(N) = N/S(1). When N is large we may
concentrate on the "bottleneck" node in the system, denoted $b$, with a
relative utilization higher than that of any other node (ignoring possible
ties).

When N is large the rate at which user service is completed at the
bottleneck node equals the number of busy servers $m_b$ times the service
rate for a server $\mu_b$. This rate must equal the rate at which users
enter the bottleneck, which in turn equals the visit ratio $v_b$ times the
throughput X(N). Solving for the throughput and the cycle time gives:

$$X(N) = N/S(N) = m_b \mu_b / v_b \qquad (1)$$

or:

$$S(N) = N v_b / (m_b \mu_b) \qquad (2)$$

That is, the cycle time increases with N for large N, and with the
average service requirement at the bottleneck node. It decreases as more
processing capacity is devoted to user jobs, that is, as $m_b$ increases.
The throughput X(N) is a constant determined by the capacity of the
bottleneck node.

Now, when N is large $S(N) = N v_b/(m_b \mu_b)$, while when N is small
S(N) is approximately $S(1) = v_1/\mu_1 + \cdots + v_n/\mu_n$. Equating
these two expressions gives a critical value for N, N*:

$$N^* = S(1)\, m_b \mu_b / v_b = \frac{m_b \mu_b}{v_b} \sum_{i=1}^{n} \frac{v_i}{\mu_i} \qquad (3)$$


When N > N* the system, in Kleinrock’s terminology [11], is
"saturated," and the cycle time grows in proportion to N. On the other
hand when N < N* the cycle time is more or less constant, and its lower
bound is S(1). N* can be thought of as the number of simultaneous users
that can be accommodated without queueing if they were each given exactly
S(1) seconds of service.
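The two regimes of the bottleneck model can be sketched in a few lines. This is an illustration under stated assumptions, not the author's code: the function and argument names are hypothetical, each device is described by its total service requirement v_i/mu_i and its number of servers m_i, and the terminal node enters only through the think time.

```python
def bottleneck_model(think_s, demands_s, servers):
    """Return (N_star, S, X) for the bottleneck approximation.

    demands_s[i] is v_i / mu_i in seconds for device i (terminals excluded);
    servers[i] is m_i for the same device.
    """
    s1 = think_s + sum(demands_s)  # S(1)
    # Capacity m_b * mu_b / v_b of the most heavily loaded device.
    capacity = min(m / d for m, d in zip(servers, demands_s))
    n_star = s1 * capacity         # critical value N*

    def cycle_time(n):
        """S(N): roughly S(1) below N*, N * v_b / (m_b * mu_b) above it."""
        return max(s1, n / capacity)

    def throughput(n):
        """X(N) = N / S(N)."""
        return n / cycle_time(n)

    return n_star, cycle_time, throughput


# Table 1 parameters: CPU, disk, and drum, one server each, 20 s think time.
n_star, S, X = bottleneck_model(20.0, [1.0, 0.88, 0.32], [1, 1, 1])
print(f"N* = {n_star:.1f}")
for n in (10, 22, 40):
    print(f"N = {n:2d}: S(N) = {S(n):5.1f} s, X(N) = {X(n):.3f} units/s")
```

Below N* the cycle time stays at S(1) and throughput grows linearly with N; above N* throughput is pinned at the bottleneck capacity and the cycle time grows in proportion to N.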

The Total Costs of Time Sharing

The three cost components (fixed costs, variable costs and time costs)
must be combined for decision making purposes. We call the first two
cost elements together the private cost. The private cost is F + cN when
there are N users on the system. F is the fixed cost (in dollars per
hour, say) and c the unit variable cost (in dollars per user per hour).

There are N users on the system, each incurring a time cost of w per
hour. Hence, total time cost for N users is wN, and the total cost
is:

$$C(N) = F + cN + wN \qquad (4)$$

The units of C(N) are dollars per unit time. For this value of N, 
the throughput X(N) gives the number of units of service provided per unit 
time. However, conceptually a cost function is a function of output, 
written e.g., C(X). Since X(N) is an increasing function, we could in
principle invert it and find this function. But there is no analytic
expression known for X(N), so this would not be a particularly insightful
way to proceed. Instead we use the "bottleneck" approximation for X(N) 
and S(N) , and then look at the resulting approximate cost function. 

Costs in the Bottleneck Model 

The general expression for the average cost per unit of computing
service delivered is found by dividing Equation (4) by X(N):

$$A(N) = F/X(N) + (c + w)\, S(N) \qquad (5)$$

In terms of the bottleneck model, this is:

$$A(N) = \begin{cases}
    (F/N + c + w)\, S(1), & N < N^* \\
    (F + (c + w) N) \,/\, (m_b \mu_b / v_b), & N \ge N^*
\end{cases} \qquad (6)$$

Evidently, Equation (6) has a minimum at N* since it is decreasing with N
when N ≤ N* and increasing when N ≥ N*. Thus N* measures the system's
economically efficient operating point.
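A short sketch of Equation (6) shows the U-shape numerically. The parameter values are assumptions for illustration: S(1) = 22.2 seconds and a bottleneck capacity of 1 unit per second come from the example section, and F = 100, c = 3, w = 17 $/hr are the values suggested in footnote 6 (so F/(c + w) = 5). The division by 3600 converts from dollars per hour and seconds to dollars per unit of service.

```python
def average_cost(n, fixed, c, w, s1, capacity):
    """Equation (6): dollars per unit of service under the bottleneck model.

    fixed, c, w are in $/hr; s1 is S(1) in seconds; capacity is
    m_b * mu_b / v_b in units of service per second.
    """
    n_star = s1 * capacity
    if n < n_star:
        return (fixed / n + c + w) * s1 / 3600.0
    return (fixed + (c + w) * n) / capacity / 3600.0


S1, CAPACITY = 22.2, 1.0       # from the example: S(1) and CPU capacity
F, C, W = 100.0, 3.0, 17.0     # $/hr, illustrative; F/(c + w) = 5

costs = {n: average_cost(n, F, C, W, S1, CAPACITY) for n in range(1, 61)}
best_n = min(costs, key=costs.get)

print(f"lowest-cost integer N under the bottleneck model: {best_n}")
for n in (10, 21, 22, 23, 40):
    print(f"N = {n:2d}: A(N) = ${costs[n]:.4f} per unit")
```

Because N must be an integer, the minimum falls at N = 22, the largest whole number of users below N* = 22.2.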

The explanation for this result is that when N < N* there is idle
capacity at the bottleneck node. In the bottleneck model congestion does
not begin increasing until N = N*, so adding more users spreads the fixed
costs over a larger number of users without increasing congestion costs.
However, when N > N* adding another user adds to everyone else’s
collective delay, thereby increasing costs.^5

If we could express X(N) analytically, we could find an approximate
value for the minimum by treating N as a continuous variable and
differentiating. Thus, A'(N) = 0 would imply X(N)/X'(N) - N = F/(c + w).
Admittedly, computing X(N) is not computationally difficult, so finding an 
exact minimum is fairly easy if the parameters of the queuing network model 
already have been determined. However, the bottleneck model is often used 
for rule-of-thumb calculations and the cost model used here is presented in 
a similar spirit. 
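One way to carry out the exact calculation is sketched below. It uses mean value analysis (MVA), a standard algorithm for product-form closed networks; this is a swapped-in technique chosen for brevity, not necessarily the method used to produce the paper's exact results. The sketch assumes one infinite-server terminal node and single-server devices, as in Table 1, and all function names are hypothetical. Since A(N) = (c + w)(F/(c + w) + N)/X(N), the minimizing N depends only on the cost ratio.

```python
def mva(n_users, think_s, demands_s):
    """Exact MVA for a terminal node plus single-server queueing devices.

    Returns a list of (X(N), S(N)) for N = 1..n_users, with X in units of
    service per second and S in seconds.
    """
    queue = [0.0] * len(demands_s)  # mean queue lengths with N - 1 users
    results = []
    for n in range(1, n_users + 1):
        # An arriving user sees the queues of the network with n - 1 users.
        residence = [d * (1.0 + q) for d, q in zip(demands_s, queue)]
        response = sum(residence)                 # T(N)
        x = n / (think_s + response)              # X(N)
        queue = [x * r for r in residence]        # Little's Result per device
        results.append((x, think_s + response))   # S(N) = T(N) + think time
    return results


def exact_min_cost_users(ratio, think_s, demands_s, n_max=60):
    """N in 1..n_max minimizing (ratio + N) / X(N), i.e. A(N) / (c + w)."""
    curve = mva(n_max, think_s, demands_s)
    scaled = [(ratio + n) / x for n, (x, _) in enumerate(curve, start=1)]
    return 1 + scaled.index(min(scaled))


DEMANDS = [1.0, 0.88, 0.32]  # v_i / mu_i for CPU, disk, drum (Table 1)
curve = mva(60, 20.0, DEMANDS)
for ratio in (5, 10, 20, 50):
    print(ratio, exact_min_cost_users(ratio, 20.0, DEMANDS))
```

The search range `n_max=60` is an arbitrary illustrative cutoff; a larger value may be needed for other parameter sets.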

An Example 

To illustrate how well the bottleneck approximation works in a cost
function, we extend an example used by Denning and Buzen [7]. This example
has three devices (CPU, drum, and disk) with the queuing network parameters
shown in Table 1. In this example the CPU is the bottleneck node, with
$v_b/(m_b \mu_b) = 1$ second. The total time required to deliver a unit of
service is T(1) = 2.2 seconds, and the think time is 20 seconds. Hence
S(1) = 22.2 seconds and N* = 22.2 users. Figure 1 shows the value of S(N)
calculated by the bottleneck approximation (solid line), as well as the
exact solution (dashed lines).

Inspection of Equation (5) shows that the location of the true minimum of
cost per unit of service depends only on the cost ratio F/(c + w). Table 2
shows the location of the minimum cost point for values of this ratio
between 5 and 50.^6 As can be seen, although the cost ratio varies by a
factor of ten, the true minimum stays close to the approximation
N* = 22.2. This implies that the bottleneck approximation can be used to
specify the system’s capacity without causing large errors.

Also, the exactly computed cost curve is flat in the region of the
minimum. This is also shown in Table 2, which gives the range of N for
which unit costs are within 5% of the minimum. This range includes
N* = 22.2 in all four cases. Figures 2 and 3 show the full cost curve for
the first and last cases shown in the table. These figures also illustrate
the fact that the cost curve is flat near its minimum value.


Table 1

Queuing Network Parameters Used in Example

Node, i         Number of       Visit         Mean Service Time,   Mean time required
                Servers, m_i    Ratio, v_i    1/mu_i (sec)         per cycle (sec)

1. Terminals    N               1             20
2. CPU          1               20            0.05                 1.0
3. Disk         1               11            0.08                 0.88
4. Drum         1               8             0.04                 0.32

Table 2

Minimum Cost Values of N for Example

             Minimum    Values of N for Costs Within 5% of Minimum
F/(c + w)    Cost N     Lower      Upper

5            17         12         24
10           20         16         27
20           24         18         30
50           28         22         36

Conclusion


This paper has discussed some of the technical-economic issues caused
by congestion in time shared systems. The closed queueing network model has 
been used to show how to measure output and costs, and the bottleneck 
approximation derived from that model allows us to express the cost function 
simply. Using the approximation, we find that the minimum cost per unit of 
services delivered to users occurs at the "saturation" value N* , which is 
the user load just sufficient fully to utilize the bottleneck server. Put 
differently, we have shown that it is economically efficient to saturate the 
bottleneck, compared either to under- or over-saturation. Intuitively, this 
is because in an under-saturated system (N < N*) additional users cause 
little congestion but reduce the fixed costs per user. In an over-saturated 
system (N > N*) additional users delay the other users, causing their 
costs to rise. 

Judging from the example, this result from the bottleneck theory
appears to be a good approximation to the exact queueing network solution.
This is because the cost function is flat near its minimum. Thus, using the
bottleneck value of capacity (N*) should provide a quick check for
system designers on the economic efficiency of their system.

FOOTNOTES


* Partially supported by the National Science Foundation’s grants
GJ 36392X and IST-8108350, and the National Aeronautics and Space
Administration’s contract NASW 3204. This is a heavily revised version
of "The Economics of Time Shared Computing: Congestion Costs and
Economies of Scale," Report No. 12, Program in Information Technology
and Telecommunications, Stanford University, October 1974.

1. The special issue of Computing Surveys [9] contains several articles. 
See also [8] and [12]. 

2. Multiple job classes could be introduced, but their presence would not 
change the "bottleneck" results dealt with below except insofar as 
different classes have different bottlenecks. A multi-class bottleneck
model (without congestion) is presented in Kriebel and Raviv [13] (see
also [6]).

3. In some computing environments (e.g., academic) productivity may be 
harder to measure than in this example because it is more difficult to 
see what alternative use of time could have been made. Of course, 
this measurement problem does not invalidate the concept of user costs,
and the idea that they need to be taken into account. 

4. Time sharing systems have been studied analytically and empirically with
this model for over a decade. In particular, the machine repair model,
adapted by Scherr [15] and extended by Greenberger [10], Kleinrock
[11], and Adiri and Avi-Itzhak [1] among others, has been used to
describe the interaction between a central processor and a finite user
population. The general queuing network [8] has been used by Buzen
[3], Baskett and Muntz [2], and Chang et al. [5] to model systems with
several sources of congestion. See also the references cited in
footnote 1.

5. Because one user by his actions imposes costs on others the costs are
said to be external. The general term for phenomena such as congestion
where these costs are imposed is "externality."

6. The values of F/(c + w) were chosen to reflect the probable range of 
costs for the system used in the example. For example, if terminal and 
line charges are considered the primary variable costs, we might take
c = 3 $/hr. Users with a high value of time might have w = 17 $/hr
(i.e., c + w = 20 $/hr). If the fixed costs are 100 $/hr this would
give F/(c + w) = 5. On the other hand, if variable costs and user
time costs were very low, e.g., c + w = 5 $/hr, and fixed costs were
250 $/hr, we would have F/(c + w) = 50. Intermediate values (e.g.,
c + w = 10 $/hr and F = 200 $/hr) also seem reasonable.

Figure Captions


Figure 1    Cycle time vs. number of users for example.

Figure 2    Cost per unit of throughput versus number of users for example,
            F/(c + w) = 5.

Figure 3    Cost per unit of throughput versus number of users for example,
            F/(c + w) = 50.